AI029
Reinforcement Learning: An Introduction
Monte Carlo Methods
Learning Objectives
- Identify the core differences between Monte Carlo methods and dynamic programming.
- Explain the estimation of state-value functions using first-visit and every-visit Monte Carlo prediction.
- Apply Monte Carlo control to discover optimal policies using the policy iteration framework.
- Analyze the importance of the exploring starts assumption for policy convergence.
- Distinguish on-policy from off-policy Monte Carlo methods, and explain how off-policy methods use importance sampling.